103 research outputs found

    Thinking spatial

    The systems community in both academia and industry has had tremendous success in building widely used general-purpose systems for various types of data and applications. Examples include database systems, big data systems, data streaming systems, and machine learning systems. The vast majority of these systems are ill-equipped to support spatial data. The main reason is that system builders mostly think of spatial data as just one more type of data. Spatial support is treated as an afterthought that can be provided via on-top functions or spatial cartridges added to already-built systems. This article advocates that spatial data and applications need to be natively supported in special-purpose systems, where spatial data is considered a first-class citizen and spatial operations are built inside the engine rather than on top of it. System builders should consider spatial data while building their systems. The article gives examples of five categories of systems, namely, database systems, big data systems, machine learning systems, recommender systems, and social network systems, that would benefit tremendously, in terms of both accuracy and performance, from treating spatial data as an integral part of the system engine.

    Detection and tracking of discrete phenomena in sensor-network databases

    This paper introduces a framework for Phenomena Detection and Tracking (PDT, for short) in sensor network databases. Examples of detectable phenomena include the propagation over time of a pollution cloud or an oil spill region. We provide a crisp definition of a phenomenon that takes into consideration both the strength and the time span of the phenomenon. We focus on discrete phenomena where sensor readings are drawn from a discrete set of values, e.g., item numbers or pollutant IDs, and we point out how our work can be extended to handle continuous phenomena. The challenge for the proposed PDT framework is to detect as many phenomena as possible, given the large number of sensors, the overall high arrival rates of sensor data, and the limited system resources. Our proposed PDT framework uses continuous SQL queries to detect and track phenomena. Execution of these continuous queries is performed in three phases: the joining phase, the candidate selection phase, and the grouping/output phase. The joining phase employs an in-memory multi-way join algorithm that produces a set of sensor pairs with similar readings. The candidate selection phase filters the output of the joining phase to select candidate join pairs with enough strength and time span, as specified by the phenomenon definition. The grouping/output phase constructs the overall phenomenon from the candidate join pairs. We introduce two optimizations to increase the likelihood of phenomena detection while using fewer system resources. Experimental studies illustrate the performance gains of both the proposed PDT framework and the proposed optimizations.

    SOLE: Scalable On-Line Execution of Continuous Queries on Spatio-temporal Data Streams


    On Query Processing and Optimality Using Spectral Locality-Preserving Mappings


    Scalable continuous query processing in location-aware database servers

    The widespread use of cellular phones, handheld devices, and GPS-like technology enables location-aware environments where virtually all objects are aware of their locations. Location-aware environments and location-aware services are characterized by a large number of moving objects and a large number of continuously moving queries (also known as spatio-temporal queries). Such environments call for new query processing techniques that deal with the continuous movement and frequent updates of both spatio-temporal objects and spatio-temporal queries. This dissertation presents novel paradigms and algorithms for efficient processing and scalable execution of continuous spatio-temporal queries in location-aware database servers. We introduce a disk-based framework that exploits shared execution and incremental evaluation paradigms. With shared execution, the problem of evaluating a set of concurrent continuous queries is abstracted to a spatial join between the set of moving objects and the set of moving queries. With incremental evaluation, rather than repeatedly re-evaluating continuous queries, we produce only the updates to the most recently reported answer. For streaming environments, we introduce a generic class of spatio-temporal operators that can be tuned with a set of parameters and methods to act as various continuous spatio-temporal queries (e.g., range queries and k-nearest-neighbor queries). The spatio-temporal operators can be combined with other traditional operators (e.g., join, distinct, and aggregate) to support a wide variety of continuous spatio-temporal queries. To support scalability in streaming environments, we introduce a scalable operator that shares memory resources among all outstanding continuous queries. To cope with intervals of high arrival rates of data objects and/or continuous queries, the proposed scalable operator utilizes a self-tuning approach based on load shedding, where some of the stored objects are dropped from memory. The experimental evaluation compares our disk-based approach with recent scalable approaches and shows the superior performance of our techniques. We also experimentally evaluate our spatio-temporal operators based on a real implementation inside an open-source data stream management system. The experimental results show that by delving inside the database engine and providing pipelined operators for continuous spatio-temporal queries, we can achieve performance orders of magnitude better than other application-level algorithms.
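The incremental evaluation idea mentioned in this abstract (reporting only the changes to the previous answer instead of recomputing it) can be illustrated with a minimal sketch. It is not the dissertation's disk-based framework: it assumes a single in-memory rectangular range query, and the names `RangeQuery` and `apply_update` are hypothetical, introduced here only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RangeQuery:
    """Hypothetical continuous range query that remembers its last answer."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float
    answer: set = field(default_factory=set)

    def contains(self, x: float, y: float) -> bool:
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax


def apply_update(query: RangeQuery, obj_id: str, x: float, y: float):
    """Process one location update and return only the change to the answer.

    Returns ("+", obj_id) if the object entered the range,
    ("-", obj_id) if it left the range, or None if nothing changed.
    """
    inside_now = query.contains(x, y)
    inside_before = obj_id in query.answer
    if inside_now and not inside_before:
        query.answer.add(obj_id)
        return ("+", obj_id)
    if inside_before and not inside_now:
        query.answer.discard(obj_id)
        return ("-", obj_id)
    return None  # object stayed inside or stayed outside


# Illustrative stream of (object id, x, y) location updates.
q = RangeQuery(0, 0, 10, 10)
stream = [("a", 5, 5), ("b", 20, 20), ("a", 6, 6), ("a", 11, 5)]
deltas = [apply_update(q, *u) for u in stream]
```

In this sketch an object that moves while staying inside the range produces no output at all, which is the saving that incremental evaluation is meant to capture.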

    Performance of Multi-Dimensional Space-Filling Curves
